Translation Dictionaries Triangulation

Authors

  • Alberto Simões
  • Xavier Gómez Guinovart
Abstract

Probabilistic Translation Dictionaries (PTD) are translation resources that can be obtained automatically from parallel corpora. Although this process is simple, it requires parallel corpora for the languages involved. Minoritized languages have a limited amount of available resources: while a few parallel corpora may exist for them, the number of language pairs covered tends to be restricted. We argue that if a minoritized language A has a parallel corpus with a language B, and language B has a parallel corpus with another language C, then we can obtain a useful probabilistic translation dictionary between A and C. In this document we formalize the triangulation of probabilistic translation dictionaries, perform experiments triangulating between Galician, English and Italian, and conclude with an evaluation of the proposed approach.
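
The triangulation idea can be illustrated with a minimal sketch. The abstract does not give the authors' formalization, so the composition rule used here (summing the product of translation probabilities over every pivot word) is an assumption, as are the function name `triangulate`, the nested-dictionary representation of a PTD, and the toy Galician, English and Italian entries.

```python
from collections import defaultdict


def triangulate(ptd_ab, ptd_bc, min_prob=0.0):
    """Compose an A->B and a B->C dictionary into an A->C dictionary.

    Assumed representation (not taken from the paper): each PTD maps a
    source word to a dict of {target word: translation probability}.
    Assumed rule: P(c|a) = sum over pivot words b of P(b|a) * P(c|b).
    """
    ptd_ac = {}
    for a_word, b_translations in ptd_ab.items():
        scores = defaultdict(float)
        for b_word, p_ab in b_translations.items():
            # Pivot words missing from the B->C dictionary contribute nothing.
            for c_word, p_bc in ptd_bc.get(b_word, {}).items():
                scores[c_word] += p_ab * p_bc
        # Drop candidates at or below the threshold to limit pivot noise.
        kept = {c: p for c, p in scores.items() if p > min_prob}
        if kept:
            ptd_ac[a_word] = kept
    return ptd_ac


# Toy probabilities invented for illustration only.
gl_en = {"can": {"dog": 0.9, "hound": 0.1}}
en_it = {
    "dog": {"cane": 0.8, "cagna": 0.2},
    "hound": {"segugio": 0.7, "cane": 0.3},
}
gl_it = triangulate(gl_en, en_it)
```

With these toy values, "cane" collects probability mass through both pivot words and ends up as the highest-scoring Italian candidate for Galician "can". A `min_prob` threshold is one simple way to discard weak candidates that arise only through a rare or polysemous pivot word.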


Similar resources

Pivot-based multilingual dictionary building using Wiktionary

Judit Ács, Research Institute for Linguistics, Hungarian Academy of Sciences. We describe a method for expanding existing dictionaries in several languages by discovering previously non-existent links between translations. We call this method triangulation and we present and compare several variations of it. We assess precision manually, and recall by comparing the ...


Filtering Wiktionary triangles by linear mapping between distributed word models

Triangulation infers word translations in a pair of languages based on translations to other, typically better resourced ones called pivots. This method may introduce noise if words in the pivot are polysemous. The reliability of each triangulated translation is basically estimated by the number of pivot languages (Tanaka and Umemura, 1994). Mikolov et al. (2013b) introduce a method for scoring...


Equivalence in Technical Texts: The Case of Accounting Terms in English-Persian Dictionaries

Translating accounting documents, in general, and accounting terminology, in particular, is not a simple task, especially when new terms keep being created in pace with accounting developments. This study was carried out to find the most common and preferable ways to translate accounting terms from English into Persian. Also, an attempt was made to identify the frequently used patterns of word-fo...


Creating Sentiment Dictionaries via Triangulation

The paper presents a semi-automatic approach to creating sentiment dictionaries in many languages. We first produced high-level gold-standard sentiment dictionaries for two languages and then translated them automatically into third languages. Those words that can be found in both target language word lists are likely to be useful because their word senses are likely to be similar to that of the...


Sharable Formats and Their Supporting Environments for Exchanging User Dictionaries among Different MT Systems as a Part of AAMT Activities

We, machine translation providers, as members of the Asia-Pacific Association for Machine Translation (AAMT), are now establishing environments for sharing and exchanging user dictionaries among different machine translation systems. In order for users to utilize machine translation systems more effectively, we define common formats of user dictionaries, and establish electronic environments availa...



Publication date: 2010